The Concept of α-Outliers in Structured Data Situations
Abstract
In every statistical data analysis, surprising observations can occur that deviate strongly from the remaining observations or from the assumed model. On the one hand, these observations may contain important information about the data-generating process. On the other hand, they might simply be measurement or reporting errors. Regardless of its origin, such an observation is commonly called an “outlier”. There are numerous ways to detect outliers, and no strategy outperforms the others in every situation. Besides non-parametric procedures, e.g., those based on depth measures, model-based strategies also exist. To detect outliers, one first needs to specify what is meant by an outlier. In this contribution, we discuss the notion of α-outliers as introduced by Davies and Gather (1993). The basic idea is that there exists a pattern which is supported by the majority of the data. Observations that deviate strongly from this pattern are understood as outliers. Within the α-outlier concept, the pattern is the statistical model one has in mind for the data-generating mechanism, and observations that lie in a region of low probability, and are therefore surprising, are understood as outliers. The general idea of α-outliers can be applied to basically any statistical model. The so-called outlier region is usually uniquely defined for a given statistical distribution. In the analysis of observed data sets, however, this distribution is often specified only up to some unknown parameters of the assumed class of distributions, which makes outlier identification procedures necessary. This chapter is structured as follows: Sect. 6.2 reviews the general definition of α-outlier regions. One-step approaches towards the detection of α-outliers in a ...
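To make the idea of an outlier region concrete, the following is a minimal sketch for the simplest case: a univariate normal model N(μ, σ²) whose parameters are assumed to be known. Under that assumption, the α-outlier region consists of all points x with |x − μ| > z₁₋α/₂ · σ, where z₁₋α/₂ is the standard normal quantile. The function names and parameters below are illustrative only and are not taken from the chapter. With unknown parameters, which the abstract notes is the usual case in practice, an identification procedure based on estimates would be needed instead.

```python
from statistics import NormalDist


def normal_alpha_outlier_bounds(mu, sigma, alpha):
    """Return (lower, upper) limits of the inlier region of N(mu, sigma^2).

    Points strictly outside these limits lie in the alpha-outlier region.
    Assumes mu and sigma are known (hypothetical helper, for illustration).
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    # Standard normal quantile z_{1 - alpha/2}
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return mu - z * sigma, mu + z * sigma


def is_alpha_outlier(x, mu, sigma, alpha):
    """True if x falls in the alpha-outlier region of N(mu, sigma^2)."""
    lower, upper = normal_alpha_outlier_bounds(mu, sigma, alpha)
    return x < lower or x > upper


if __name__ == "__main__":
    # Flag observations that are surprising under a standard normal model.
    sample = [-0.3, 0.8, 1.1, 2.5, -3.2]
    flagged = [x for x in sample if is_alpha_outlier(x, 0.0, 1.0, 0.05)]
    print(flagged)
```

By construction, the region has probability exactly α under the assumed model, so a smaller α yields a smaller region and fewer flagged observations.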
Related resources
Impact of Outliers in Data Envelopment Analysis
This paper examines the relationship between data envelopment analysis and the statistical concept of an "outlier". Data envelopment analysis (DEA) is a method for estimating the relative efficiency of decision-making units (DMUs) that perform similar tasks in a production system, using multiple inputs to produce multiple outputs. An important issue in statistics is to identify the outliers. In this pap...
Methods for Determining Outliers in Medical Studies
Background: An outlier is an observation that lies an abnormal distance from other values in a random sample from a population. Outliers sometimes lead to abnormalities in the results obtained from collected data and information. It is important for researchers, physicians and others who work in medical fields and sciences to know about outlier data, and they must check the data before getting result a...
Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data
Background and purpose: As science, knowledge, and technology evolve, we deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and their interpretation. For high-dimensional problems, classical methods are not reliable because of a large number of predictor variable...
Identification of outliers types in multivariate time series using genetic algorithm
Multivariate time series data are often modeled using a vector autoregressive moving average (VARMA) model. However, the presence of outliers can violate the stationarity assumption and may lead to wrong modeling, biased estimation of parameters and inaccurate prediction. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...
Investigation of outliers of evaluation scores among school of health instructors using outlier-determination indices
Introduction: Teacher evaluation, as an important strategy for improving the quality of education, has been considered by universities and leads to a better understanding of the strengths and weaknesses of education. Analysis of instructors’ scores is one of the main fields of educational research. Since outliers affect analysis and interpretation of information processes both structurally and concep...